Exploiting lexical regularities in designing natural language systems

Authors

  • Boris Katz
  • Beth Levin
Abstract

This paper presents the lexical component of the START Question Answering system developed at the MIT Artificial Intelligence Laboratory. START is able to interpret correctly a wide range of semantic relationships associated with alternate expressions of the arguments of verbs. The design of the system takes advantage of the results of recent linguistic research into the structure of the lexicon, allowing START to attain a broader range of coverage than many existing systems while maintaining modular organization.

1. Introduction

If asked "Did Sally eat?" after having been told that Sally ate a pear, speakers of English would not hesitate to answer "Yes". But we would not expect English speakers to answer "Yes" if asked "Did David dress?" after being told that David dressed the baby. Here the appropriate answer would be "I don't know". Computational linguists engaged in building Question-Answering systems should find these examples thought-provoking. Two sequences consisting of a statement followed by a question, which appear to be parallel syntactically (transitive use of a verb in the statement, intransitive use of the same verb in the question), elicit quite different responses. The simple syntax of these pairs is unlikely to pose a challenge for the parsers used in most existing systems. The problem is that the intransitive uses of the two verbs, eat and dress, receive very different interpretations. Thus the intransitive use of eat found in the question "Did Sally eat?" implies the existence of an understood but unexpressed object that is interpreted as a prototypical type of food or a meal:

(1) Sally ate a pear. ⇒ Sally ate. (i.e., Sally ate some food.)

The question "Did David dress?", on the other hand, does not mean 'Did David dress something one typically dresses?'; it means 'Did David dress himself?':

(2) David dressed the baby. ⇏ David dressed. (i.e., David dressed himself.)
Natural language systems should be able to recognize that the relationship between transitive and intransitive dress is not the same as that between transitive and intransitive eat. A large number of English verbs have both transitive and intransitive uses. Interchanges parallel to the one described for eat are possible with a wide range of verbs:

(3) Jessica typed a letter. Did Jessica type? Yes.
(4) Gabriella swept the floor. Did she sweep? Yes.
(5) Miriam read the book. Did Miriam read? Yes.

But the behavior of the verb dress is not exceptional. Another set of verbs, including bathe, change, shave, shower, and wash, behaves like it. For example, these verbs show the same entailments as dress:

(6) Carla bathed the dog. ⇏ Carla bathed. (i.e., Carla bathed herself.)
(7) Jill washed the sweater. ⇏ Jill washed. (i.e., Jill washed herself.)
(8) Peter shaved Tom. ⇏ Peter shaved. (i.e., Peter shaved himself.)

The different relationships between transitive and intransitive uses of verbs cannot be ignored in the design of a natural language system and its lexical component. The most obvious way to handle these relationships is to add information to the lexical entries of each verb with transitive and intransitive uses. While such an approach is viable when a system has a small lexicon, it becomes less tractable as the lexicon grows larger, since it requires a tremendous increase in the amount of idiosyncratic information which must be registered in the entry of each verb. The examples discussed so far illustrate just a few of a wide range of relationships between alternate expressions of the arguments of verbs that must be correctly interpreted by any natural language system that aims at substantial coverage of English. We believe that what is required in order to implement a system that meets these demands is an understanding of English lexical organization.
For this reason we draw on recent theoretical linguistic investigations into the lexical knowledge possessed by native speakers of English carried out by the MIT Lexicon Project (Rappaport, Levin, and Laughren [1988], Levin [1985], Hale and Keyser [1986], Levin and Rappaport [to appear]). These studies have established a range of semantic-syntactic interdependencies exhibited by semantically coherent classes of verbs and have identified a number of essential classes of verbs, as well as the central properties characterizing verbs of each type. The results of this work have been used in the design of a lexical component for the START natural language system developed at the MIT Artificial Intelligence Laboratory (Katz [1988]). In this paper we show how these results allow START to attain a broader range of coverage than most existing systems while maintaining modular organization.
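
The contrast between eat-type and dress-type verbs can be illustrated with a minimal Python sketch. It is not START's implementation, which this excerpt does not describe. The class names, the `VERB_CLASS` table, and the `answer_intransitive_question` function are hypothetical, and the verb lists contain only the verbs named in the examples above. The sketch assumes that each verb is assigned to a class once and that the class, rather than the individual entry, determines how the intransitive use is interpreted.

```python
from enum import Enum


class IntransitiveReading(Enum):
    """Hypothetical labels for the two interpretations discussed in the text."""
    UNSPECIFIED_OBJECT = "unspecified object"  # "Sally ate" = Sally ate some food
    REFLEXIVE = "reflexive"                    # "David dressed" = David dressed himself


# Hypothetical class membership, limited to verbs mentioned in the examples.
VERB_CLASS = {
    verb: IntransitiveReading.UNSPECIFIED_OBJECT
    for verb in ("eat", "type", "sweep", "read")
}
VERB_CLASS.update({
    verb: IntransitiveReading.REFLEXIVE
    for verb in ("dress", "bathe", "change", "shave", "shower", "wash")
})


def answer_intransitive_question(fact, question):
    """Answer "Did SUBJECT VERB?" given one transitive fact.

    fact is a (subject, verb, object) triple and question is a
    (subject, verb) pair; verbs are given in their base form.
    """
    fact_subject, fact_verb, fact_object = fact
    question_subject, question_verb = question

    # The fact is only relevant if it is about the same subject and verb.
    if (fact_subject, fact_verb) != (question_subject, question_verb):
        return "I don't know"

    reading = VERB_CLASS.get(question_verb)
    if reading is IntransitiveReading.UNSPECIFIED_OBJECT:
        # Any object entails the intransitive use.
        return "Yes"
    if reading is IntransitiveReading.REFLEXIVE:
        # The intransitive use is entailed only if the subject acted on itself.
        return "Yes" if fact_object == fact_subject else "I don't know"
    return "I don't know"


print(answer_intransitive_question(("Sally", "eat", "pear"), ("Sally", "eat")))
print(answer_intransitive_question(("David", "dress", "baby"), ("David", "dress")))
```

Under these assumptions, adding a new verb means assigning it to an existing class rather than writing a new verb-specific rule, which is the kind of modularity the passage argues for.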


Similar resources

Comparing Distributional and Mirror Translation Similarities for Extracting Synonyms

Automated thesaurus construction by collecting relations between lexical items (synonyms, antonyms, etc.) has a long tradition in natural language processing. This has been done by exploiting dictionary structures or distributional context regularities (co-occurrence, syntactic associations, or translation equivalents), in order to define measures of lexical similarity or relatedness. Dyvik had pr...


The Use of Lexical Bundles in Native and Non-native Post-graduate Writing: The Case of Applied Linguistics MA Theses

Connor et al. (2008) mention “specifying textual requirements of genres” (p.12) as one of the reasons which have motivated researchers in the analysis of writing. Members of each genre should be able to produce and retrieve these textual requirements appropriately to be considered communicatively proficient. One of the textual requirements of genres is regularities of specific forms and content...


Exploring linguistically-rich patterns for question generation

Linguistic patterns reflect the regularities of Natural Language and their applicability is acknowledged in several Natural Language Processing tasks. Particularly, in the task of Question Generation, many systems depend on patterns to generate questions from text. The approach we follow relies on patterns that convey lexical, syntactic and semantic information, automatically learned from large...


Exploiting Context to Identify Lexical Atoms - A Statistical View of Linguistic Context

Interpretation of natural language is inherently context-sensitive. Most words in natural language are ambiguous and their meanings are heavily dependent on the linguistic context in which they are used. The study of lexical semantics cannot be separated from the notion of context. This paper takes a contextual approach to lexical semantics and studies the linguistic context of lexical atoms, ...


Applying Semantic Frame Theory to Automate Natural Language Template Generation From Ontology Statements

Today there exist a growing number of framenet-like resources offering semantic and syntactic phrase specifications that can be exploited by natural language generation systems. In this paper we present on-going work that provides a starting point for exploiting framenet information for multilingual natural language generation. We describe the kind of information offered by modern computational...


Exploiting Sublanguage and Domain Characteristics in a Bootstrapping Approach to Lexicon and Ontology Creation

It is very costly to build up lexical resources and domain ontologies. Especially when confronted with a new application domain, lexical gaps and poor coverage of domain concepts are a problem for the successful exploitation of natural language document analysis systems that need and exploit such knowledge sources. In this paper we report on ongoing experiments with ‘bootstrapping technique...




Publication date: 1988